COMPUTO

A newborn academic journal promoting reproducibility

2022 Toronto Workshop on Reproducibility

Facts in Stats/ML academic publications

  • Multiplication of academic journals and articles
    → at the expense of quality / scientific accuracy?
  • Lack of recognition for computational/algorithmic developments and case studies
  • Saturation of existing solutions
  • Rapid evolution of tools
    • Great IDEs (RStudio, VS Code)
    • Scientific publishing systems (Quarto, Jupyter)
    • Integration / dynamics (Git/GitHub, GitHub Actions)

Computo’s Board Positioning

Need to renew the mode of publication of scientific knowledge and know-how

Existing solutions

Standard academic journals

Statistics and Computing, Computational Statistics and Data Analysis, Journal of Computational and Graphical Statistics, JMLR, JRSS-B, JASA, …

Limitations

Fixed, typically non-dynamic format (PDF), which limits reproducibility

Software journals

The R Journal, Journal of Statistical Software, Journal of Open Source Software, JMLR Machine Learning Open Source Software, rOpenSci…

Limitations

Congested, language-centric, focused on software documentation, and not structured around a scientific question

Computo in a nutshell

Aims and Scope

Promote contributions in statistics and machine learning that provide insight into which models/methods are the most appropriate to a specific question.

Open and reproducible

  • Reproducibility of numerical results is a necessary condition for publication
  • All necessary data and code must be available
  • Reviews are open (reviewers can remain anonymous)

Assessing reproducibility

At the submission stage!

Editorial Board

Chloé-Agathe Azencott

Machine Learning for therapeutic research
Mines ParisTech, Inserm, Institut Curie

Pierre Neuvial

Statistics
CNRS, Institut de Mathématiques de Toulouse

Julien Chiquet (Chief Editor)

Statistical learning for life science
Université Paris-Saclay, AgroParisTech, INRAE

Nelle Varoquaux

Machine learning and causal inference for genomics
CNRS, Université Grenoble Alpes

How does Computo work? (1/2)

1. Advanced notebook system

https://quarto.org (embeds Jupyter and R Markdown)

  • Code (Python/Julia/R)
  • Math (\(\LaTeX\))
  • Biblio (\(\mathrm{Bib}\TeX\))
  • Interactivity (HTML widgets, CSS)

2. Git repository and services

GitHub / GitHub Actions

  • Continuous integration (reproducibility script)
  • Project management (submission, publication)
  • Issues (reviewing, discussion)

How does Computo work? (2/2)

3. Container service

Binder

  • Easy to customize
  • Easy to interface with GitHub

4. Reviewing system

Scholastica

  • Discussion among the editorial board
  • Reviewer invitation
  • Anonymous exchanges between authors and reviewers

→ Eventually published (as Issues)

Author point of view (1/3)

Step 0: set up a GitHub repository

Copy our template repository to use it as a starter

Step 1: write your contribution

Write your notebook as usual (same spirit as Jupyter/R Markdown).
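
A minimal sketch of what the header of such a notebook might look like, assuming a Quarto document named content.qmd (the file rendered in Step 3) with Python code cells. The title, author, and the references.bib file name are hypothetical placeholders, and the actual Computo template may set additional or different options.

```yaml
---
# Hypothetical front matter for content.qmd; every value is a placeholder.
title: "My contribution"
author: "Jane Doe"
bibliography: references.bib   # BibTeX file holding the cited references
jupyter: python3               # Jupyter kernel used to execute code cells
format:
  html:
    code-fold: true            # let readers collapse/expand code blocks
---
```

The body below this header is ordinary Markdown mixed with executable code cells, LaTeX math, and citations resolved against the BibTeX file.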

Step 2: configure your Binder environment

file environment.yml

name: computorbuild
channels:
  - conda-forge
dependencies:
  - jupyter
  - numpy
  - r-base=4.1.1

Author point of view (2/3)

Step 3: prove reproducibility

A git push triggers the build process on GitHub

name: build
on: push
jobs:
  computorticle:
    runs-on: ubuntu-latest
    steps: # [...]
      - name: Installing dependencies with Miniconda
        uses: conda-incubator/setup-miniconda@v2
        with:
          environment-file: environment.yml
        # [...]
      - name: Rendering with Quarto
        run: quarto render content.qmd
        # [...]
      - name: Deploying article on github pages
        # [...]

Author point of view (3/3)

Step 4: submit

If the build process is successful,

  • An HTML version is pushed to GitHub Pages
  • A PDF version can be obtained via chrome-print
  • A Binder repository can be associated

→ Submit the PDF on the Scholastica page

See our quarto template for more

Author point of view: summary

Editor point of view

Once the reviewing process has ended (successfully!)

  • Copy the author’s repository
  • Format to the final version
  • Publish reviews as issues
  • Add an entry on the journal website referencing the
    • GitHub repository
    • data repository
    • reviews

See our mock contribution for more

Full process overview

Some inspiring initiatives

Distill

https://distill.pub, a journal focused mainly on machine learning and deep learning

ReScience

https://rescience.github.io/, a journal publishing “Remake/Redo” of existing works to prove reproducibility

Peer Community In (aka PCI)

https://peercommunityin.org/, a free recommendation process for scientific preprints, based on peer review

  • Work with communities of recommenders
  • We plan to create a PCI for Computo to bypass Scholastica